Dataset statistics
| Number of variables | 27 |
|---|---|
| Number of observations | 396030 |
| Missing cells | 81589 |
| Missing cells (%) | 0.8% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 81.6 MiB |
| Average record size in memory | 216.0 B |
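The derived figures in this table can be cross-checked from the raw counts. A minimal sketch, assuming the missing-cell percentage is taken over variables × observations and that 1 MiB is 1024² bytes; the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 27
n_observations = 396030
missing_cells = 81589
total_size_mib = 81.6

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results agree with the table to within the rounding of the reported 81.6 MiB.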
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 15 |
Alerts
| Alert | Type |
|---|---|
| emp_title has a high cardinality: 173105 distinct values | High cardinality |
| issue_d has a high cardinality: 115 distinct values | High cardinality |
| title has a high cardinality: 48817 distinct values | High cardinality |
| earliest_cr_line has a high cardinality: 684 distinct values | High cardinality |
| address has a high cardinality: 393700 distinct values | High cardinality |
| loan_amnt is highly correlated with term and 1 other fields | High correlation |
| installment is highly correlated with loan_amnt | High correlation |
| open_acc is highly correlated with total_acc | High correlation |
| pub_rec is highly correlated with pub_rec_bankruptcies | High correlation |
| total_acc is highly correlated with open_acc | High correlation |
| pub_rec_bankruptcies is highly correlated with pub_rec | High correlation |
| sub_grade is highly correlated with term and 2 other fields | High correlation |
| grade is highly correlated with int_rate and 1 other fields | High correlation |
| term is highly correlated with loan_amnt and 2 other fields | High correlation |
| int_rate is highly correlated with term and 2 other fields | High correlation |
| emp_title has 22927 (5.8%) missing values | Missing |
| emp_length has 18301 (4.6%) missing values | Missing |
| mort_acc has 37795 (9.5%) missing values | Missing |
| annual_inc is highly skewed (γ1 = 41.04272475) | Skewed |
| dti is highly skewed (γ1 = 431.0512254) | Skewed |
| address is uniformly distributed | Uniform |
| pub_rec has 338272 (85.4%) zeros | Zeros |
| mort_acc has 139777 (35.3%) zeros | Zeros |
| pub_rec_bankruptcies has 350380 (88.5%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-29 17:37:52.294375 |
|---|---|
| Analysis finished | 2022-11-29 17:39:18.417791 |
| Duration | 1 minute and 26.12 seconds |
| Software version | pandas-profiling v3.3.0 |
| Download configuration | config.json |
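The reported duration follows directly from the two timestamps. A small sketch using only the standard library (the format string is an assumption based on how the timestamps are printed here):

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-11-29 17:37:52.294375", FMT)
finished = datetime.strptime("2022-11-29 17:39:18.417791", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```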
loan_amnt
Numeric
| Distinct | 1397 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14113.88809 |
| Minimum | 500 |
|---|---|
| Maximum | 40000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 500 |
|---|---|
| 5-th percentile | 3250 |
| Q1 | 8000 |
| median | 12000 |
| Q3 | 20000 |
| 95-th percentile | 30975 |
| Maximum | 40000 |
| Range | 39500 |
| Interquartile range (IQR) | 12000 |
Descriptive statistics
| Standard deviation | 8357.441341 |
|---|---|
| Coefficient of variation (CV) | 0.5921430926 |
| Kurtosis | -0.06259753499 |
| Mean | 14113.88809 |
| Median Absolute Deviation (MAD) | 5500 |
| Skewness | 0.7772854671 |
| Sum | 5589523100 |
| Variance | 69846825.77 |
| Monotonicity | Not monotonic |
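Several of the descriptive statistics above are simple functions of the others, which gives a quick consistency check. A sketch with the reported values hard-coded (the names are illustrative, not part of the report):

```python
import math

# Values copied from the quantile and descriptive statistics tables.
n = 396030
mean, std = 14113.88809, 8357.441341
minimum, maximum = 500, 40000
q1, q3 = 8000, 20000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # should reproduce the reported Sum

print(cv, variance, value_range, iqr, total)
```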
| Value | Count | Frequency (%) |
| 10000 | 27668 | 7.0% |
| 12000 | 21366 | 5.4% |
| 15000 | 19903 | 5.0% |
| 20000 | 18969 | 4.8% |
| 35000 | 14576 | 3.7% |
| 8000 | 13539 | 3.4% |
| 6000 | 12734 | 3.2% |
| 5000 | 12443 | 3.1% |
| 16000 | 10129 | 2.6% |
| 18000 | 9195 | 2.3% |
| Other values (1387) | 235508 | 59.5% |
| Value | Count | Frequency (%) |
| 500 | 4 | < 0.1% |
| 700 | 1 | < 0.1% |
| 725 | 1 | < 0.1% |
| 750 | 1 | < 0.1% |
| 800 | 1 | < 0.1% |
| 900 | 1 | < 0.1% |
| 950 | 1 | < 0.1% |
| 1000 | 1448 | 0.4% |
| 1025 | 4 | < 0.1% |
| 1050 | 10 | < 0.1% |
| Value | Count | Frequency (%) |
| 40000 | 180 | < 0.1% |
| 39700 | 1 | < 0.1% |
| 39600 | 1 | < 0.1% |
| 39500 | 1 | < 0.1% |
| 39475 | 1 | < 0.1% |
| 39200 | 1 | < 0.1% |
| 38825 | 1 | < 0.1% |
| 38750 | 1 | < 0.1% |
| 38475 | 1 | < 0.1% |
| 38300 | 1 | < 0.1% |
term
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| 36 months | 302005 |
|---|---|
| 60 months | 94025 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3960300 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 36 months |
|---|---|
| 2nd row | 36 months |
| 3rd row | 36 months |
| 4th row | 36 months |
| 5th row | 60 months |
Common Values
| Value | Count | Frequency (%) |
| 36 months | 302005 | 76.3% |
| 60 months | 94025 | 23.7% |
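The Frequency (%) column is each count divided by the column total. A minimal sketch that recomputes the shares from the two counts above (one-decimal rounding is assumed to match the report's display):

```python
# Counts copied from the Common Values table.
counts = {"36 months": 302005, "60 months": 94025}

total = sum(counts.values())
# Share of each value as a percentage, rounded to one decimal place.
shares = {value: round(100 * count / total, 1) for value, count in counts.items()}

for value, share in shares.items():
    print(f"{value}: {share}%")
```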
[Figures omitted: Length, Category Frequency Plot]
Words
| Value | Count | Frequency (%) |
| months | 396030 | 50.0% |
| 36 | 302005 | 38.1% |
| 60 | 94025 | 11.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 792060 | 20.0% |
| 6 | 396030 | 10.0% |
| m | 396030 | 10.0% |
| o | 396030 | 10.0% |
| n | 396030 | 10.0% |
| t | 396030 | 10.0% |
| h | 396030 | 10.0% |
| s | 396030 | 10.0% |
| 3 | 302005 | 7.6% |
| 0 | 94025 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2376180 | 60.0% |
| Space Separator | 792060 | 20.0% |
| Decimal Number | 792060 | 20.0% |
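The category names in this table are Unicode general categories. The sketch below reproduces the per-row split with Python's `unicodedata`; it assumes the raw value is `" 36 months"` with a leading space, which is inferred from the totals (10 characters and 2 spaces per row) rather than stated in the report:

```python
import unicodedata
from collections import Counter

# Assumed raw value: 10 characters, including a leading space.
sample = " 36 months"

# Ll = Lowercase Letter, Nd = Decimal Number, Zs = Space Separator.
categories = Counter(unicodedata.category(ch) for ch in sample)

# Scale the per-row counts by the number of observations.
n_observations = 396030
scaled = {cat: count * n_observations for cat, count in categories.items()}
print(categories)
print(scaled)
```

Because every value has the same shape, scaling the per-row counts by the number of observations gives the column totals.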
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| m | 396030 | 16.7% |
| o | 396030 | 16.7% |
| n | 396030 | 16.7% |
| t | 396030 | 16.7% |
| h | 396030 | 16.7% |
| s | 396030 | 16.7% |
Decimal Number
| Value | Count | Frequency (%) |
| 6 | 396030 | 50.0% |
| 3 | 302005 | 38.1% |
| 0 | 94025 | 11.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 792060 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2376180 | 60.0% |
| Common | 1584120 | 40.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| m | 396030 | 16.7% |
| o | 396030 | 16.7% |
| n | 396030 | 16.7% |
| t | 396030 | 16.7% |
| h | 396030 | 16.7% |
| s | 396030 | 16.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 792060 | 50.0% |
| 6 | 396030 | 25.0% |
| 3 | 302005 | 19.1% |
| 0 | 94025 | 5.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3960300 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 792060 | 20.0% |
| 6 | 396030 | 10.0% |
| m | 396030 | 10.0% |
| o | 396030 | 10.0% |
| n | 396030 | 10.0% |
| t | 396030 | 10.0% |
| h | 396030 | 10.0% |
| s | 396030 | 10.0% |
| 3 | 302005 | 7.6% |
| 0 | 94025 | 2.4% |
int_rate
Numeric
| Distinct | 566 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.63940005 |
| Minimum | 5.32 |
|---|---|
| Maximum | 30.99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 5.32 |
|---|---|
| 5-th percentile | 6.89 |
| Q1 | 10.49 |
| median | 13.33 |
| Q3 | 16.49 |
| 95-th percentile | 21.97 |
| Maximum | 30.99 |
| Range | 25.67 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.472157382 |
|---|---|
| Coefficient of variation (CV) | 0.3278851978 |
| Kurtosis | -0.1439465381 |
| Mean | 13.63940005 |
| Median Absolute Deviation (MAD) | 3.08 |
| Skewness | 0.420669472 |
| Sum | 5401611.6 |
| Variance | 20.00019165 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10.99 | 12411 | 3.1% |
| 12.99 | 9632 | 2.4% |
| 15.61 | 9350 | 2.4% |
| 11.99 | 8582 | 2.2% |
| 8.9 | 8019 | 2.0% |
| 12.12 | 7358 | 1.9% |
| 7.9 | 7332 | 1.9% |
| 16.29 | 6632 | 1.7% |
| 13.11 | 6580 | 1.7% |
| 6.03 | 6291 | 1.6% |
| Other values (556) | 313843 | 79.2% |
| Value | Count | Frequency (%) |
| 5.32 | 2440 | 0.6% |
| 5.42 | 465 | 0.1% |
| 5.79 | 333 | 0.1% |
| 5.93 | 431 | 0.1% |
| 5.99 | 278 | 0.1% |
| 6 | 70 | < 0.1% |
| 6.03 | 6291 | 1.6% |
| 6.17 | 220 | 0.1% |
| 6.24 | 1184 | 0.3% |
| 6.39 | 656 | 0.2% |
| Value | Count | Frequency (%) |
| 30.99 | 13 | < 0.1% |
| 30.94 | 3 | < 0.1% |
| 30.89 | 3 | < 0.1% |
| 30.84 | 1 | < 0.1% |
| 30.79 | 9 | < 0.1% |
| 30.74 | 4 | < 0.1% |
| 30.49 | 5 | < 0.1% |
| 29.99 | 7 | < 0.1% |
| 29.96 | 8 | < 0.1% |
| 29.67 | 15 | < 0.1% |
installment
Numeric
| Distinct | 55706 |
|---|---|
| Distinct (%) | 14.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 431.849698 |
| Minimum | 16.08 |
|---|---|
| Maximum | 1533.81 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 16.08 |
|---|---|
| 5-th percentile | 109.51 |
| Q1 | 250.33 |
| median | 375.43 |
| Q3 | 567.3 |
| 95-th percentile | 925.6 |
| Maximum | 1533.81 |
| Range | 1517.73 |
| Interquartile range (IQR) | 316.97 |
Descriptive statistics
| Standard deviation | 250.7277895 |
|---|---|
| Coefficient of variation (CV) | 0.5805904014 |
| Kurtosis | 0.7838199213 |
| Mean | 431.849698 |
| Median Absolute Deviation (MAD) | 150.5 |
| Skewness | 0.9835981609 |
| Sum | 171025435.9 |
| Variance | 62864.42443 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 327.34 | 968 | 0.2% |
| 332.1 | 791 | 0.2% |
| 491.01 | 736 | 0.2% |
| 336.9 | 686 | 0.2% |
| 392.81 | 683 | 0.2% |
| 332.72 | 641 | 0.2% |
| 337.47 | 624 | 0.2% |
| 317.54 | 574 | 0.1% |
| 654.68 | 556 | 0.1% |
| 261.88 | 527 | 0.1% |
| Other values (55696) | 389244 | 98.3% |
| Value | Count | Frequency (%) |
| 16.08 | 1 | < 0.1% |
| 16.25 | 1 | < 0.1% |
| 16.31 | 1 | < 0.1% |
| 16.47 | 1 | < 0.1% |
| 19.87 | 1 | < 0.1% |
| 20.22 | 1 | < 0.1% |
| 21.25 | 1 | < 0.1% |
| 21.62 | 1 | < 0.1% |
| 21.99 | 1 | < 0.1% |
| 22.24 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1533.81 | 1 | < 0.1% |
| 1527 | 1 | < 0.1% |
| 1503.85 | 1 | < 0.1% |
| 1479.49 | 1 | < 0.1% |
| 1464.42 | 1 | < 0.1% |
| 1458.25 | 1 | < 0.1% |
| 1451.14 | 2 | < 0.1% |
| 1451.12 | 2 | < 0.1% |
| 1445.9 | 1 | < 0.1% |
| 1443.76 | 1 | < 0.1% |
grade
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| B | 116018 |
|---|---|
| C | 105987 |
| A | 64187 |
| D | 63524 |
| E | 31488 |
| Other values (2) | 14826 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 396030 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | B |
|---|---|
| 2nd row | B |
| 3rd row | B |
| 4th row | A |
| 5th row | C |
Common Values
| Value | Count | Frequency (%) |
| B | 116018 | 29.3% |
| C | 105987 | 26.8% |
| A | 64187 | 16.2% |
| D | 63524 | 16.0% |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
[Figures omitted: Length, Category Frequency Plot]
Words
| Value | Count | Frequency (%) |
| b | 116018 | 29.3% |
| c | 105987 | 26.8% |
| a | 64187 | 16.2% |
| d | 63524 | 16.0% |
| e | 31488 | 8.0% |
| f | 11772 | 3.0% |
| g | 3054 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 116018 | 29.3% |
| C | 105987 | 26.8% |
| A | 64187 | 16.2% |
| D | 63524 | 16.0% |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 396030 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 116018 | 29.3% |
| C | 105987 | 26.8% |
| A | 64187 | 16.2% |
| D | 63524 | 16.0% |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 396030 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| B | 116018 | 29.3% |
| C | 105987 | 26.8% |
| A | 64187 | 16.2% |
| D | 63524 | 16.0% |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 396030 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| B | 116018 | 29.3% |
| C | 105987 | 26.8% |
| A | 64187 | 16.2% |
| D | 63524 | 16.0% |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
sub_grade
Categorical
| Distinct | 35 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| B3 | 26655 |
|---|---|
| B4 | 25601 |
| C1 | 23662 |
| C2 | 22580 |
| B2 | 22495 |
| Other values (30) | 275037 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 792060 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | B4 |
|---|---|
| 2nd row | B5 |
| 3rd row | B3 |
| 4th row | A2 |
| 5th row | C5 |
Common Values
| Value | Count | Frequency (%) |
| B3 | 26655 | 6.7% |
| B4 | 25601 | 6.5% |
| C1 | 23662 | 6.0% |
| C2 | 22580 | 5.7% |
| B2 | 22495 | 5.7% |
| B5 | 22085 | 5.6% |
| C3 | 21221 | 5.4% |
| C4 | 20280 | 5.1% |
| B1 | 19182 | 4.8% |
| A5 | 18526 | 4.7% |
| Other values (25) | 173743 | 43.9% |
[Figure omitted: Length]
Words
| Value | Count | Frequency (%) |
| b3 | 26655 | 6.7% |
| b4 | 25601 | 6.5% |
| c1 | 23662 | 6.0% |
| c2 | 22580 | 5.7% |
| b2 | 22495 | 5.7% |
| b5 | 22085 | 5.6% |
| c3 | 21221 | 5.4% |
| c4 | 20280 | 5.1% |
| b1 | 19182 | 4.8% |
| a5 | 18526 | 4.7% |
| Other values (25) | 173743 | 43.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 116018 | |
| C | 105987 | |
| 1 | 81077 | |
| 4 | 80849 | |
| 3 | 79720 | |
| 2 | 79544 | |
| 5 | 74840 | |
| A | 64187 | |
| D | 63524 | |
| E | 31488 | 4.0% |
| Other values (2) | 14826 | 1.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 396030 | |
| Decimal Number | 396030 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 116018 | |
| C | 105987 | |
| A | 64187 | |
| D | 63524 | |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 81077 | |
| 4 | 80849 | |
| 3 | 79720 | |
| 2 | 79544 | |
| 5 | 74840 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 396030 | |
| Common | 396030 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| B | 116018 | |
| C | 105987 | |
| A | 64187 | |
| D | 63524 | |
| E | 31488 | 8.0% |
| F | 11772 | 3.0% |
| G | 3054 | 0.8% |
Common
| Value | Count | Frequency (%) |
| 1 | 81077 | |
| 4 | 80849 | |
| 3 | 79720 | |
| 2 | 79544 | |
| 5 | 74840 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 792060 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| B | 116018 | |
| C | 105987 | |
| 1 | 81077 | |
| 4 | 80849 | |
| 3 | 79720 | |
| 2 | 79544 | |
| 5 | 74840 | |
| A | 64187 | |
| D | 63524 | |
| E | 31488 | 4.0% |
| Other values (2) | 14826 | 1.9% |
emp_title
Categorical
| Distinct | 173105 |
|---|---|
| Distinct (%) | 46.4% |
| Missing | 22927 |
| Missing (%) | 5.8% |
| Memory size | 3.0 MiB |
| Teacher | 4389 |
|---|---|
| Manager | 4250 |
| Registered Nurse | 1856 |
| RN | 1846 |
| Supervisor | 1830 |
| Other values (173100) | 358932 |
Length
| Max length | 78 |
|---|---|
| Median length | 56 |
| Mean length | 16.5867361 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6188561 |
|---|---|
| Distinct characters | 125 |
| Distinct categories | 17 |
| Distinct scripts | 2 |
| Distinct blocks | 3 |
Unique
| Unique | 145247 |
|---|---|
| Unique (%) | 38.9% |
Sample
| 1st row | Marketing |
|---|---|
| 2nd row | Credit analyst |
| 3rd row | Statistician |
| 4th row | Client Advocate |
| 5th row | Destiny Management Inc. |
Common Values
| Value | Count | Frequency (%) |
| Teacher | 4389 | 1.1% |
| Manager | 4250 | 1.1% |
| Registered Nurse | 1856 | 0.5% |
| RN | 1846 | 0.5% |
| Supervisor | 1830 | 0.5% |
| Sales | 1638 | 0.4% |
| Project Manager | 1505 | 0.4% |
| Owner | 1410 | 0.4% |
| Driver | 1339 | 0.3% |
| Office Manager | 1218 | 0.3% |
| Other values (173095) | 351822 | 88.8% |
| (Missing) | 22927 | 5.8% |
[Figure omitted: Length]
Words
| Value | Count | Frequency (%) |
| manager | 39270 | 4.7% |
| of | 15802 | 1.9% |
| inc | 10469 | 1.2% |
| director | 9837 | 1.2% |
| sales | 9635 | 1.1% |
| assistant | 9259 | 1.1% |
| analyst | 7652 | 0.9% |
| specialist | 7627 | 0.9% |
| supervisor | 7501 | 0.9% |
| engineer | 7462 | 0.9% |
| Other values (55359) | 717784 |
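A table like this comes from splitting each value into lowercase tokens and counting them. The sketch below illustrates the idea on a handful of titles taken from the Common Values table; the tokenization rule (runs of letters and digits) is an assumption and may differ from what the profiler actually does:

```python
import re
from collections import Counter

def word_counts(values):
    """Count lowercase alphanumeric tokens across all values."""
    counter = Counter()
    for value in values:
        # Assumed rule: a word is a run of ASCII letters or digits.
        counter.update(re.findall(r"[a-z0-9]+", value.lower()))
    return counter

# Small illustrative sample, not the full column.
titles = ["Teacher", "Manager", "Project Manager", "Office Manager", "Sales"]
counts = word_counts(titles)
print(counts.most_common(3))
```

This also shows why "manager" ranks far higher as a word than "Manager" does as a full value: it appears inside many distinct titles.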
Most occurring characters
| Value | Count | Frequency (%) |
| e | 606206 | 9.8% |
| (space) | 487836 | 7.9% |
| r | 470449 | 7.6% |
| a | 455384 | 7.4% |
| i | 406094 | 6.6% |
| n | 405205 | 6.5% |
| t | 373457 | 6.0% |
| o | 330975 | 5.3% |
| s | 293945 | 4.7% |
| c | 244175 | 3.9% |
| Other values (115) | 2114835 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4701345 | |
| Uppercase Letter | 939905 | 15.2% |
| Space Separator | 487839 | 7.9% |
| Other Punctuation | 45687 | 0.7% |
| Decimal Number | 6383 | 0.1% |
| Dash Punctuation | 5541 | 0.1% |
| Open Punctuation | 847 | < 0.1% |
| Close Punctuation | 821 | < 0.1% |
| Math Symbol | 114 | < 0.1% |
| Control | 30 | < 0.1% |
| Other values (7) | 49 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 606206 | |
| r | 470449 | |
| a | 455384 | |
| i | 406094 | |
| n | 405205 | |
| t | 373457 | |
| o | 330975 | 7.0% |
| s | 293945 | 6.3% |
| c | 244175 | 5.2% |
| l | 200056 | 4.3% |
| Other values (23) | 915399 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 115480 | |
| C | 93703 | 10.0% |
| A | 87717 | 9.3% |
| M | 73150 | 7.8% |
| P | 57688 | 6.1% |
| T | 54371 | 5.8% |
| E | 50241 | 5.3% |
| I | 49183 | 5.2% |
| R | 48438 | 5.2% |
| D | 45784 | 4.9% |
| Other values (20) | 264150 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 18744 | |
| , | 9790 | |
| / | 7921 | |
| & | 6352 | 13.9% |
| ' | 2523 | 5.5% |
| # | 140 | 0.3% |
| ; | 45 | 0.1% |
| : | 43 | 0.1% |
| ! | 31 | 0.1% |
| " | 28 | 0.1% |
| Other values (7) | 70 | 0.2% |
Control
| Value | Count | Frequency (%) |
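| (control character) | 8 | 26.7% |
| (control character) | 7 | 23.3% |
| (control character) | 3 | 10.0% |
| (control character) | 2 | 6.7% |
| (control character) | 2 | 6.7% |
| (control character) | 2 | 6.7% |
| (control character) | 2 | 6.7% |
| (control character) | 1 | 3.3% |
| (control character) | 1 | 3.3% |
| (control character) | 1 | 3.3% |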
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1428 | |
| 2 | 1303 | |
| 3 | 1002 | |
| 4 | 548 | 8.6% |
| 0 | 435 | 6.8% |
| 5 | 409 | 6.4% |
| 6 | 385 | 6.0% |
| 9 | 335 | 5.2% |
| 7 | 321 | 5.0% |
| 8 | 217 | 3.4% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 92 | |
| \| | 16 | 14.0% |
| ~ | 4 | 3.5% |
| ¬ | 1 | 0.9% |
| < | 1 | 0.9% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 839 | |
| [ | 7 | 0.8% |
| { | 1 | 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 816 | |
| ] | 4 | 0.5% |
| } | 1 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 487836 | > 99.9% |
| (other space character) | 3 | < 0.1% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 8 | |
| ¢ | 3 | 27.3% |
Other Number
| Value | Count | Frequency (%) |
| ² | 3 | |
| ³ | 1 | 25.0% |
Format
| Value | Count | Frequency (%) |
| (format character) | 1 | 50.0% |
| (format character) | 1 | 50.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 5541 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 18 |
Other Symbol
| Value | Count | Frequency (%) |
| © | 7 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 6 |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5641250 | |
| Common | 547311 | 8.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 606206 | 10.7% |
| r | 470449 | 8.3% |
| a | 455384 | 8.1% |
| i | 406094 | 7.2% |
| n | 405205 | 7.2% |
| t | 373457 | 6.6% |
| o | 330975 | 5.9% |
| s | 293945 | 5.2% |
| c | 244175 | 4.3% |
| l | 200056 | 3.5% |
| Other values (53) | 1855304 |
Common
| Value | Count | Frequency (%) |
| (space) | 487836 | 89.1% |
| . | 18744 | 3.4% |
| , | 9790 | 1.8% |
| / | 7921 | 1.4% |
| & | 6352 | 1.2% |
| - | 5541 | 1.0% |
| ' | 2523 | 0.5% |
| 1 | 1428 | 0.3% |
| 2 | 1303 | 0.2% |
| 3 | 1002 | 0.2% |
| Other values (52) | 4871 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6188456 | |
| None | 103 | < 0.1% |
| Punctuation | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 606206 | 9.8% |
| (space) | 487836 | 7.9% |
| r | 470449 | 7.6% |
| a | 455384 | 7.4% |
| i | 406094 | 6.6% |
| n | 405205 | 6.5% |
| t | 373457 | 6.0% |
| o | 330975 | 5.3% |
| s | 293945 | 4.7% |
| c | 244175 | 3.9% |
| Other values (83) | 2114730 |
None
| Value | Count | Frequency (%) |
| Ã | 21 | |
| Â | 10 | 9.7% |
| | 8 | 7.8% |
| â | 8 | 7.8% |
| | 7 | 6.8% |
| © | 7 | 6.8% |
| é | 4 | 3.9% |
| ² | 3 | 2.9% |
| | 3 | 2.9% |
| ¢ | 3 | 2.9% |
| Other values (20) | 29 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 1 | |
| | 1 |
emp_length
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 18301 |
| Missing (%) | 4.6% |
| Memory size | 3.0 MiB |
| 10+ years | 126041 |
|---|---|
| 2 years | 35827 |
| < 1 year | 31725 |
| 3 years | 31665 |
| 5 years | 26495 |
| Other values (6) | 125976 |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 7.682830813 |
| Min length | 6 |
Characters and Unicode
| Total characters | 2902028 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 10+ years |
|---|---|
| 2nd row | 4 years |
| 3rd row | < 1 year |
| 4th row | 6 years |
| 5th row | 9 years |
Common Values
| Value | Count | Frequency (%) |
| 10+ years | 126041 | 31.8% |
| 2 years | 35827 | 9.0% |
| < 1 year | 31725 | 8.0% |
| 3 years | 31665 | 8.0% |
| 5 years | 26495 | 6.7% |
| 1 year | 25882 | 6.5% |
| 4 years | 23952 | 6.0% |
| 6 years | 20841 | 5.3% |
| 7 years | 20819 | 5.3% |
| 8 years | 19168 | 4.8% |
| (Missing) | 18301 | 4.6% |
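Values such as "10+ years" and "< 1 year" are ordered labels stored as text. If a numeric encoding is wanted, one possible mapping is sketched below; treating "< 1 year" as 0 and "10+ years" as 10 is a modelling assumption, and `parse_emp_length` is a hypothetical helper, not part of the report:

```python
import re
from typing import Optional

def parse_emp_length(value: Optional[str]) -> Optional[int]:
    """Map labels like '4 years' to an integer number of years."""
    if value is None:
        return None          # keep missing values missing
    text = value.strip()
    if text.startswith("<"):
        return 0             # assumption: '< 1 year' becomes 0
    match = re.search(r"\d+", text)
    # '10+ years' becomes 10, i.e. the cap is treated as a plain value.
    return int(match.group()) if match else None

for label in ["10+ years", "4 years", "< 1 year", None]:
    print(label, "->", parse_emp_length(label))
```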
[Figure omitted: Length]
Words
| Value | Count | Frequency (%) |
| years | 320122 | 40.7% |
| 10 | 126041 | 16.0% |
| 1 | 57607 | 7.3% |
| year | 57607 | 7.3% |
| 2 | 35827 | 4.6% |
| < | 31725 | 4.0% |
| 3 | 31665 | 4.0% |
| 5 | 26495 | 3.4% |
| 4 | 23952 | 3.0% |
| 6 | 20841 | 2.6% |
| Other values (3) | 55301 | 7.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 409454 | 14.1% |
| y | 377729 | 13.0% |
| e | 377729 | 13.0% |
| a | 377729 | 13.0% |
| r | 377729 | 13.0% |
| s | 320122 | 11.0% |
| 1 | 183648 | 6.3% |
| 0 | 126041 | 4.3% |
| + | 126041 | 4.3% |
| 2 | 35827 | 1.2% |
| Other values (8) | 189979 | 6.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1831038 | |
| Decimal Number | 503770 | 17.4% |
| Space Separator | 409454 | 14.1% |
| Math Symbol | 157766 | 5.4% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 183648 | |
| 0 | 126041 | |
| 2 | 35827 | 7.1% |
| 3 | 31665 | 6.3% |
| 5 | 26495 | 5.3% |
| 4 | 23952 | 4.8% |
| 6 | 20841 | 4.1% |
| 7 | 20819 | 4.1% |
| 8 | 19168 | 3.8% |
| 9 | 15314 | 3.0% |
Lowercase Letter
| Value | Count | Frequency (%) |
| y | 377729 | |
| e | 377729 | |
| a | 377729 | |
| r | 377729 | |
| s | 320122 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 126041 | |
| < | 31725 | 20.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 409454 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1831038 | |
| Common | 1070990 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 409454 | 38.2% |
| 1 | 183648 | 17.1% |
| 0 | 126041 | 11.8% |
| + | 126041 | 11.8% |
| 2 | 35827 | 3.3% |
| < | 31725 | 3.0% |
| 3 | 31665 | 3.0% |
| 5 | 26495 | 2.5% |
| 4 | 23952 | 2.2% |
| 6 | 20841 | 1.9% |
| Other values (3) | 55301 | 5.2% |
Latin
| Value | Count | Frequency (%) |
| y | 377729 | |
| e | 377729 | |
| a | 377729 | |
| r | 377729 | |
| s | 320122 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2902028 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 409454 | 14.1% |
| y | 377729 | 13.0% |
| e | 377729 | 13.0% |
| a | 377729 | 13.0% |
| r | 377729 | 13.0% |
| s | 320122 | 11.0% |
| 1 | 183648 | 6.3% |
| 0 | 126041 | 4.3% |
| + | 126041 | 4.3% |
| 2 | 35827 | 1.2% |
| Other values (8) | 189979 | 6.5% |
home_ownership
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| MORTGAGE | 198348 |
|---|---|
| RENT | 159790 |
| OWN | 37746 |
| OWN | |
| OTHER | 112 |
| NONE | 31 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 5.908327652 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2339875 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | RENT |
|---|---|
| 2nd row | MORTGAGE |
| 3rd row | RENT |
| 4th row | RENT |
| 5th row | MORTGAGE |
Common Values
| Value | Count | Frequency (%) |
| MORTGAGE | 198348 | 50.1% |
| RENT | 159790 | 40.3% |
| OWN | 37746 | 9.5% |
| OTHER | 112 | < 0.1% |
| NONE | 31 | < 0.1% |
| ANY | 3 | < 0.1% |
[Figures omitted: Length, Category Frequency Plot]
Words
| Value | Count | Frequency (%) |
| mortgage | 198348 | 50.1% |
| rent | 159790 | 40.3% |
| own | 37746 | 9.5% |
| other | 112 | < 0.1% |
| none | 31 | < 0.1% |
| any | 3 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 396696 | |
| E | 358281 | |
| R | 358250 | |
| T | 358250 | |
| O | 236237 | |
| A | 198351 | |
| M | 198348 | |
| N | 197601 | |
| W | 37746 | 1.6% |
| H | 112 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2339875 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 396696 | |
| E | 358281 | |
| R | 358250 | |
| T | 358250 | |
| O | 236237 | |
| A | 198351 | |
| M | 198348 | |
| N | 197601 | |
| W | 37746 | 1.6% |
| H | 112 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2339875 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 396696 | |
| E | 358281 | |
| R | 358250 | |
| T | 358250 | |
| O | 236237 | |
| A | 198351 | |
| M | 198348 | |
| N | 197601 | |
| W | 37746 | 1.6% |
| H | 112 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2339875 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| G | 396696 | |
| E | 358281 | |
| R | 358250 | |
| T | 358250 | |
| O | 236237 | |
| A | 198351 | |
| M | 198348 | |
| N | 197601 | |
| W | 37746 | 1.6% |
| H | 112 | < 0.1% |
annual_inc
Numeric
| Distinct | 27197 |
|---|---|
| Distinct (%) | 6.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 74203.1758 |
| Minimum | 0 |
|---|---|
| Maximum | 8706582 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 28000 |
| Q1 | 45000 |
| median | 64000 |
| Q3 | 90000 |
| 95-th percentile | 150000 |
| Maximum | 8706582 |
| Range | 8706582 |
| Interquartile range (IQR) | 45000 |
Descriptive statistics
| Standard deviation | 61637.62116 |
|---|---|
| Coefficient of variation (CV) | 0.8306601503 |
| Kurtosis | 4238.550572 |
| Mean | 74203.1758 |
| Median Absolute Deviation (MAD) | 21000 |
| Skewness | 41.04272475 |
| Sum | 2.938668371 × 10^10 |
| Variance | 3799196342 |
| Monotonicity | Not monotonic |
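A skewness this large usually means a few extreme values dominate the third moment, which fits the gap between the 95th percentile (150000) and the maximum (8706582). The sketch below computes the population skewness g1 = m3 / m2^1.5 on toy data; the profiler may use a bias-corrected estimator, so this illustrates the concept rather than reproducing the reported figure:

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Toy samples (hypothetical, not taken from the dataset).
symmetric = [1, 2, 3, 4, 5]
one_outlier = [1, 1, 1, 1, 100]

print(skewness(symmetric))
print(skewness(one_outlier))
```

A single large value is enough to push the statistic well above zero, even in a five-element sample.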
| Value | Count | Frequency (%) |
| 60000 | 15313 | 3.9% |
| 50000 | 13303 | 3.4% |
| 65000 | 11333 | 2.9% |
| 70000 | 10674 | 2.7% |
| 40000 | 10629 | 2.7% |
| 45000 | 10114 | 2.6% |
| 80000 | 9971 | 2.5% |
| 75000 | 9850 | 2.5% |
| 55000 | 9195 | 2.3% |
| 90000 | 7573 | 1.9% |
| Other values (27187) | 288075 | 72.7% |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 600 | 1 | < 0.1% |
| 2500 | 1 | < 0.1% |
| 4000 | 2 | < 0.1% |
| 4080 | 1 | < 0.1% |
| 4200 | 1 | < 0.1% |
| 4524 | 1 | < 0.1% |
| 4800 | 6 | < 0.1% |
| 4888 | 1 | < 0.1% |
| 5000 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 8706582 | 1 | < 0.1% |
| 7600000 | 1 | < 0.1% |
| 7446395 | 1 | < 0.1% |
| 7141778 | 1 | < 0.1% |
| 7000000 | 1 | < 0.1% |
| 6500000 | 1 | < 0.1% |
| 6100000 | 1 | < 0.1% |
| 6000000 | 2 | < 0.1% |
| 5000000 | 1 | < 0.1% |
| 4900000 | 1 | < 0.1% |
verification_status
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| Verified | 139563 |
|---|---|
| Source Verified | 131385 |
| Not Verified | 125082 |
Length
| Max length | 15 |
|---|---|
| Median length | 12 |
| Mean length | 11.58564503 |
| Min length | 8 |
Characters and Unicode
| Total characters | 4588263 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Not Verified |
|---|---|
| 2nd row | Not Verified |
| 3rd row | Source Verified |
| 4th row | Not Verified |
| 5th row | Verified |
Common Values
| Value | Count | Frequency (%) |
| Verified | 139563 | 35.2% |
| Source Verified | 131385 | 33.2% |
| Not Verified | 125082 | 31.6% |
[Figures omitted: Length, Category Frequency Plot]
Words
| Value | Count | Frequency (%) |
| verified | 396030 | 60.7% |
| source | 131385 | 20.1% |
| not | 125082 | 19.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 923445 | |
| i | 792060 | |
| r | 527415 | |
| V | 396030 | |
| f | 396030 | |
| d | 396030 | |
| o | 256467 | 5.6% |
| (space) | 256467 | 5.6% |
| S | 131385 | 2.9% |
| u | 131385 | 2.9% |
| Other values (3) | 381549 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3679299 | |
| Uppercase Letter | 652497 | 14.2% |
| Space Separator | 256467 | 5.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 923445 | |
| i | 792060 | |
| r | 527415 | |
| f | 396030 | |
| d | 396030 | |
| o | 256467 | 7.0% |
| u | 131385 | 3.6% |
| c | 131385 | 3.6% |
| t | 125082 | 3.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| V | 396030 | |
| S | 131385 | 20.1% |
| N | 125082 | 19.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 256467 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4331796 | |
| Common | 256467 | 5.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 923445 | |
| i | 792060 | |
| r | 527415 | |
| V | 396030 | |
| f | 396030 | |
| d | 396030 | |
| o | 256467 | 5.9% |
| S | 131385 | 3.0% |
| u | 131385 | 3.0% |
| c | 131385 | 3.0% |
| Other values (2) | 250164 | 5.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 256467 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4588263 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 923445 | |
| i | 792060 | |
| r | 527415 | |
| V | 396030 | |
| f | 396030 | |
| d | 396030 | |
| o | 256467 | 5.6% |
| (space) | 256467 | 5.6% |
| S | 131385 | 2.9% |
| u | 131385 | 2.9% |
| Other values (3) | 381549 |
issue_d
Categorical
| Distinct | 115 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| Oct-2014 | 14846 |
|---|---|
| Jul-2014 | 12609 |
| Jan-2015 | 11705 |
| Dec-2013 | 10618 |
| Nov-2013 | 10496 |
| Other values (110) | 335756 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Characters and Unicode
| Total characters | 3168240 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Jan-2015 |
|---|---|
| 2nd row | Jan-2015 |
| 3rd row | Jan-2015 |
| 4th row | Nov-2014 |
| 5th row | Apr-2013 |
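The sample values follow a `Mon-YYYY` pattern, so the column is typed as text and sorts alphabetically rather than chronologically. A minimal sketch of converting such strings to dates; the explicit month table avoids locale-dependent parsing, and `parse_month_year` is a hypothetical helper:

```python
from datetime import date

# English month abbreviations as they appear in the sample rows.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def parse_month_year(value: str) -> date:
    """Convert 'Oct-2014' to the first day of that month."""
    month_abbr, year = value.strip().split("-")
    return date(int(year), MONTHS[month_abbr], 1)

samples = ["Jan-2015", "Nov-2014", "Apr-2013"]
chronological = sorted(samples, key=parse_month_year)
print(chronological)
```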
Common Values
| Value | Count | Frequency (%) |
| Oct-2014 | 14846 | 3.7% |
| Jul-2014 | 12609 | 3.2% |
| Jan-2015 | 11705 | 3.0% |
| Dec-2013 | 10618 | 2.7% |
| Nov-2013 | 10496 | 2.7% |
| Jul-2015 | 10270 | 2.6% |
| Oct-2013 | 10047 | 2.5% |
| Jan-2014 | 9705 | 2.5% |
| Apr-2015 | 9470 | 2.4% |
| Sep-2013 | 9179 | 2.3% |
| Other values (105) | 287085 | 72.5% |
[Figure omitted: Length]
Words
| Value | Count | Frequency (%) |
| oct-2014 | 14846 | 3.7% |
| jul-2014 | 12609 | 3.2% |
| jan-2015 | 11705 | 3.0% |
| dec-2013 | 10618 | 2.7% |
| nov-2013 | 10496 | 2.7% |
| jul-2015 | 10270 | 2.6% |
| oct-2013 | 10047 | 2.5% |
| jan-2014 | 9705 | 2.5% |
| apr-2015 | 9470 | 2.4% |
| sep-2013 | 9179 | 2.3% |
| Other values (105) | 287085 | 72.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 437232 | |
| 0 | 410549 | |
| 1 | 408204 | |
| - | 396030 | |
| J | 104536 | 3.3% |
| 4 | 102860 | 3.2% |
| u | 102670 | 3.2% |
| a | 98496 | 3.1% |
| 3 | 97662 | 3.1% |
| 5 | 94264 | 3.0% |
| Other values (23) | 915737 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1584120 | |
| Lowercase Letter | 792060 | |
| Dash Punctuation | 396030 | 12.5% |
| Uppercase Letter | 396030 | 12.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 102670 | |
| a | 98496 | |
| e | 85443 | |
| c | 71212 | |
| r | 65142 | |
| n | 64822 | |
| p | 60842 | |
| t | 42130 | 5.3% |
| l | 39714 | 5.0% |
| o | 34068 | 4.3% |
| Other values (4) | 127521 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 437232 | |
| 0 | 410549 | |
| 1 | 408204 | |
| 4 | 102860 | 6.5% |
| 3 | 97662 | 6.2% |
| 5 | 94264 | 6.0% |
| 6 | 28088 | 1.8% |
| 9 | 3826 | 0.2% |
| 8 | 1240 | 0.1% |
| 7 | 195 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 104536 | |
| A | 66039 | |
| M | 63814 | |
| O | 42130 | |
| N | 34068 | 8.6% |
| D | 29082 | 7.3% |
| F | 28742 | 7.3% |
| S | 27619 | 7.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 396030 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1980150 | |
| Latin | 1188090 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| J | 104536 | 8.8% |
| u | 102670 | 8.6% |
| a | 98496 | 8.3% |
| e | 85443 | 7.2% |
| c | 71212 | 6.0% |
| A | 66039 | 5.6% |
| r | 65142 | 5.5% |
| n | 64822 | 5.5% |
| M | 63814 | 5.4% |
| p | 60842 | 5.1% |
| Other values (12) | 405074 |
Common
| Value | Count | Frequency (%) |
| 2 | 437232 | |
| 0 | 410549 | |
| 1 | 408204 | |
| - | 396030 | |
| 4 | 102860 | 5.2% |
| 3 | 97662 | 4.9% |
| 5 | 94264 | 4.8% |
| 6 | 28088 | 1.4% |
| 9 | 3826 | 0.2% |
| 8 | 1240 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3168240 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 437232 | |
| 0 | 410549 | |
| 1 | 408204 | |
| - | 396030 | |
| J | 104536 | 3.3% |
| 4 | 102860 | 3.2% |
| u | 102670 | 3.2% |
| a | 98496 | 3.1% |
| 3 | 97662 | 3.1% |
| 5 | 94264 | 3.0% |
| Other values (23) | 915737 |
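The month-year strings profiled above (for example Oct-2014, always 8 characters long) are stored as text, which is why the report treats the column as a high-cardinality categorical. A minimal sketch of converting them to dates, assuming the abbreviated English month names seen in the sample rows and the default C locale; the helper name `parse_month_year` is hypothetical and not part of the report:

```python
from datetime import datetime


def parse_month_year(value: str) -> datetime:
    """Parse a string such as 'Oct-2014' into the first day of that month."""
    # %b matches the abbreviated month name; this assumes an English/C locale.
    return datetime.strptime(value, "%b-%Y")


# The five sample rows shown in the report for this column.
sample = ["Jan-2015", "Jan-2015", "Jan-2015", "Nov-2014", "Apr-2013"]
parsed = [parse_month_year(v) for v in sample]
earliest = min(parsed)

for raw, date in zip(sample, parsed):
    print(raw, "->", date.date().isoformat())
```

Once parsed, the values sort chronologically rather than alphabetically, which the raw strings do not.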
loan_status
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| Fully Paid | 318357 |
|---|---|
| Charged Off | 77673 |
Length
| Max length | 11 |
|---|---|
| Median length | 10 |
| Mean length | 10.19612908 |
| Min length | 10 |
Characters and Unicode
| Total characters | 4037973 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Fully Paid |
|---|---|
| 2nd row | Fully Paid |
| 3rd row | Fully Paid |
| 4th row | Fully Paid |
| 5th row | Charged Off |
Common Values
| Value | Count | Frequency (%) |
| Fully Paid | 318357 | 80.4% |
| Charged Off | 77673 | 19.6% |
Length (plot not captured)
Category Frequency Plot (plot not captured)
Words
| Value | Count | Frequency (%) |
| fully | 318357 | 40.2% |
| paid | 318357 | 40.2% |
| charged | 77673 | 9.8% |
| off | 77673 | 9.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 636714 | 15.8% |
| (space) | 396030 | 9.8% |
| a | 396030 | 9.8% |
| d | 396030 | 9.8% |
| F | 318357 | 7.9% |
| u | 318357 | 7.9% |
| y | 318357 | 7.9% |
| P | 318357 | 7.9% |
| i | 318357 | 7.9% |
| f | 155346 | 3.8% |
| Other values (6) | 466038 | 11.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2849883 | 70.6% |
| Uppercase Letter | 792060 | 19.6% |
| Space Separator | 396030 | 9.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 636714 | |
| a | 396030 | |
| d | 396030 | |
| u | 318357 | |
| y | 318357 | |
| i | 318357 | |
| f | 155346 | 5.5% |
| h | 77673 | 2.7% |
| r | 77673 | 2.7% |
| g | 77673 | 2.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 318357 | |
| P | 318357 | |
| C | 77673 | 9.8% |
| O | 77673 | 9.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 396030 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3641943 | |
| Common | 396030 | 9.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 636714 | |
| a | 396030 | |
| d | 396030 | |
| F | 318357 | |
| u | 318357 | |
| y | 318357 | |
| P | 318357 | |
| i | 318357 | |
| f | 155346 | 4.3% |
| C | 77673 | 2.1% |
| Other values (5) | 388365 |
Common
| Value | Count | Frequency (%) |
| (space) | 396030 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4037973 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| l | 636714 | 15.8% |
| (space) | 396030 | 9.8% |
| a | 396030 | 9.8% |
| d | 396030 | 9.8% |
| F | 318357 | 7.9% |
| u | 318357 | 7.9% |
| y | 318357 | 7.9% |
| P | 318357 | 7.9% |
| i | 318357 | 7.9% |
| f | 155346 | 3.8% |
| Other values (6) | 466038 |
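The two loan_status counts above are enough to recover the class shares, including the share the report left blank for the majority class. A small sketch using only the counts printed in the Common Values table; the variable names are illustrative:

```python
# Counts copied from the loan_status Common Values table.
counts = {"Fully Paid": 318357, "Charged Off": 77673}

total = sum(counts.values())  # should equal the number of observations
shares = {label: n / total for label, n in counts.items()}

# Ratio of the majority class to the minority class.
imbalance_ratio = counts["Fully Paid"] / counts["Charged Off"]

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority/minority ratio: {imbalance_ratio:.2f}")
```

The total matches the 396030 observations in the dataset statistics, consistent with the column having no missing values.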
purpose
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| debt_consolidation | 234507 |
|---|---|
| credit_card | 83019 |
| home_improvement | 24030 |
| other | 21185 |
| major_purchase | 8790 |
| Other values (9) | 24499 |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 14.99784612 |
| Min length | 3 |
Characters and Unicode
| Total characters | 5939597 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | vacation |
|---|---|
| 2nd row | debt_consolidation |
| 3rd row | credit_card |
| 4th row | credit_card |
| 5th row | credit_card |
Common Values
| Value | Count | Frequency (%) |
| debt_consolidation | 234507 | 59.2% |
| credit_card | 83019 | 21.0% |
| home_improvement | 24030 | 6.1% |
| other | 21185 | 5.3% |
| major_purchase | 8790 | 2.2% |
| small_business | 5701 | 1.4% |
| car | 4697 | 1.2% |
| medical | 4196 | 1.1% |
| moving | 2854 | 0.7% |
| vacation | 2452 | 0.6% |
| Other values (4) | 4599 | 1.2% |
Length (plot not captured)
Words
| Value | Count | Frequency (%) |
| debt_consolidation | 234507 | 59.2% |
| credit_card | 83019 | 21.0% |
| home_improvement | 24030 | 6.1% |
| other | 21185 | 5.3% |
| major_purchase | 8790 | 2.2% |
| small_business | 5701 | 1.4% |
| car | 4697 | 1.2% |
| medical | 4196 | 1.1% |
| moving | 2854 | 0.7% |
| vacation | 2452 | 0.6% |
| Other values (4) | 4599 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 789320 | |
| d | 643129 | |
| t | 599957 | |
| i | 593335 | |
| n | 506778 | |
| e | 435403 | |
| c | 420937 | |
| _ | 356376 | 6.0% |
| a | 355447 | 6.0% |
| s | 268302 | 4.5% |
| Other values (12) | 970613 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5583221 | |
| Connector Punctuation | 356376 | 6.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 789320 | |
| d | 643129 | |
| t | 599957 | |
| i | 593335 | |
| n | 506778 | |
| e | 435403 | |
| c | 420937 | |
| a | 355447 | |
| s | 268302 | 4.8% |
| l | 250691 | 4.5% |
| Other values (11) | 719922 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 356376 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5583221 | |
| Common | 356376 | 6.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 789320 | |
| d | 643129 | |
| t | 599957 | |
| i | 593335 | |
| n | 506778 | |
| e | 435403 | |
| c | 420937 | |
| a | 355447 | |
| s | 268302 | 4.8% |
| l | 250691 | 4.5% |
| Other values (11) | 719922 |
Common
| Value | Count | Frequency (%) |
| _ | 356376 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5939597 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 789320 | |
| d | 643129 | |
| t | 599957 | |
| i | 593335 | |
| n | 506778 | |
| e | 435403 | |
| c | 420937 | |
| _ | 356376 | 6.0% |
| a | 355447 | 6.0% |
| s | 268302 | 4.5% |
| Other values (12) | 970613 |
title
Categorical
| Distinct | 48817 |
|---|---|
| Distinct (%) | 12.4% |
| Missing | 1755 |
| Missing (%) | 0.4% |
| Memory size | 3.0 MiB |
| Debt consolidation | 152472 |
|---|---|
| Credit card refinancing | 51487 |
| Home improvement | 15264 |
| Other | 12930 |
| Debt Consolidation | 11608 |
| Other values (48812) | 150514 |
Length
| Max length | 80 |
|---|---|
| Median length | 79 |
| Mean length | 17.24109315 |
| Min length | 2 |
Characters and Unicode
| Total characters | 6797732 |
|---|---|
| Distinct characters | 101 |
| Distinct categories | 15 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 41798 |
|---|---|
| Unique (%) | 10.6% |
Sample
| 1st row | Vacation |
|---|---|
| 2nd row | Debt consolidation |
| 3rd row | Credit card refinancing |
| 4th row | Credit card refinancing |
| 5th row | Credit Card Refinance |
Common Values
| Value | Count | Frequency (%) |
| Debt consolidation | 152472 | 38.5% |
| Credit card refinancing | 51487 | 13.0% |
| Home improvement | 15264 | 3.9% |
| Other | 12930 | 3.3% |
| Debt Consolidation | 11608 | 2.9% |
| Major purchase | 4769 | 1.2% |
| Consolidation | 3852 | 1.0% |
| debt consolidation | 3547 | 0.9% |
| Business | 2949 | 0.7% |
| Debt Consolidation Loan | 2864 | 0.7% |
| Other values (48807) | 132533 |
Length (plot not captured)
Words
| Value | Count | Frequency (%) |
| consolidation | 191014 | |
| debt | 190821 | |
| credit | 74290 | 8.4% |
| card | 68254 | 7.7% |
| refinancing | 52262 | 5.9% |
| loan | 28112 | 3.2% |
| home | 22625 | 2.6% |
| improvement | 18786 | 2.1% |
| other | 13252 | 1.5% |
| payoff | 6685 | 0.8% |
| Other values (14633) | 216174 |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 735791 | 10.8% |
| n | 682851 | 10.0% |
| i | 655694 | 9.6% |
| t | 545268 | 8.0% |
| e | 521004 | 7.7% |
| (space) | 494561 | 7.3% |
| a | 445747 | 6.6% |
| d | 386101 | 5.7% |
| c | 322828 | 4.7% |
| r | 295630 | 4.3% |
| Other values (91) | 1712257 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5749399 | |
| Uppercase Letter | 527589 | 7.8% |
| Space Separator | 494561 | 7.3% |
| Decimal Number | 13723 | 0.2% |
| Other Punctuation | 9147 | 0.1% |
| Dash Punctuation | 1929 | < 0.1% |
| Connector Punctuation | 663 | < 0.1% |
| Close Punctuation | 209 | < 0.1% |
| Currency Symbol | 178 | < 0.1% |
| Open Punctuation | 163 | < 0.1% |
| Other values (5) | 171 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 735791 | |
| n | 682851 | |
| i | 655694 | |
| t | 545268 | |
| e | 521004 | |
| a | 445747 | |
| d | 386101 | |
| c | 322828 | 5.6% |
| r | 295630 | 5.1% |
| s | 262921 | 4.6% |
| Other values (17) | 895564 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 187930 | |
| C | 131422 | |
| L | 26622 | 5.0% |
| H | 24603 | 4.7% |
| O | 23248 | 4.4% |
| P | 18167 | 3.4% |
| M | 17673 | 3.3% |
| R | 13527 | 2.6% |
| B | 11236 | 2.1% |
| I | 9867 | 1.9% |
| Other values (17) | 63294 | 12.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| ! | 2474 | |
| / | 1738 | |
| . | 1647 | |
| ' | 1040 | |
| , | 848 | 9.3% |
| & | 778 | 8.5% |
| % | 143 | 1.6% |
| # | 132 | 1.4% |
| : | 125 | 1.4% |
| " | 108 | 1.2% |
| Other values (5) | 114 | 1.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 4028 | |
| 2 | 3446 | |
| 0 | 3203 | |
| 3 | 1306 | 9.5% |
| 4 | 387 | 2.8% |
| 5 | 364 | 2.7% |
| 9 | 317 | 2.3% |
| 6 | 291 | 2.1% |
| 7 | 193 | 1.4% |
| 8 | 188 | 1.4% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 103 | |
| = | 17 | 11.3% |
| ~ | 9 | 6.0% |
| < | 9 | 6.0% |
| > | 8 | 5.3% |
| | | 5 | 3.3% |
Control
| Value | Count | Frequency (%) |
| (unprintable) | 11 | 73.3% |
| (unprintable) | 2 | 13.3% |
| (unprintable) | 1 | 6.7% |
| (unprintable) | 1 | 6.7% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 205 | |
| ] | 4 | 1.9% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 158 | |
| [ | 5 | 3.1% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 2 | |
| ^ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 494561 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1929 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 663 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 178 |
Other Number
| Value | Count | Frequency (%) |
| ³ | 1 |
Other Symbol
| Value | Count | Frequency (%) |
| ¦ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6276988 | |
| Common | 520744 | 7.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 735791 | |
| n | 682851 | |
| i | 655694 | |
| t | 545268 | 8.7% |
| e | 521004 | 8.3% |
| a | 445747 | 7.1% |
| d | 386101 | 6.2% |
| c | 322828 | 5.1% |
| r | 295630 | 4.7% |
| s | 262921 | 4.2% |
| Other values (44) | 1423153 |
Common
| Value | Count | Frequency (%) |
| (space) | 494561 | 95.0% |
| 1 | 4028 | 0.8% |
| 2 | 3446 | 0.7% |
| 0 | 3203 | 0.6% |
| ! | 2474 | 0.5% |
| - | 1929 | 0.4% |
| / | 1738 | 0.3% |
| . | 1647 | 0.3% |
| 3 | 1306 | 0.3% |
| ' | 1040 | 0.2% |
| Other values (37) | 5372 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6797723 | |
| None | 9 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 735791 | 10.8% |
| n | 682851 | 10.0% |
| i | 655694 | 9.6% |
| t | 545268 | 8.0% |
| e | 521004 | 7.7% |
| (space) | 494561 | 7.3% |
| a | 445747 | 6.6% |
| d | 386101 | 5.7% |
| c | 322828 | 4.7% |
| r | 295630 | 4.3% |
| Other values (84) | 1712248 |
None
| Value | Count | Frequency (%) |
| â | 2 | |
| | 2 | |
| | 1 | |
| ³ | 1 | |
| Ã | 1 | |
| ¦ | 1 | |
| 1 |
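Several of the most common title values above differ only in capitalization ("Debt consolidation", "Debt Consolidation", "debt consolidation"), which inflates the distinct count. A minimal sketch of merging such variants, using only the ten counts listed under Common Values; the `normalize` helper is hypothetical, and case-folding plus whitespace collapsing is one possible normalization, not something the report itself applies:

```python
from collections import Counter

# Top ten values and counts copied from the title Common Values table.
common_values = {
    "Debt consolidation": 152472,
    "Credit card refinancing": 51487,
    "Home improvement": 15264,
    "Other": 12930,
    "Debt Consolidation": 11608,
    "Major purchase": 4769,
    "Consolidation": 3852,
    "debt consolidation": 3547,
    "Business": 2949,
    "Debt Consolidation Loan": 2864,
}


def normalize(title: str) -> str:
    """Case-fold and collapse runs of whitespace."""
    return " ".join(title.casefold().split())


merged = Counter()
for title, count in common_values.items():
    merged[normalize(title)] += count

for title, count in merged.most_common():
    print(f"{title}: {count}")
```

Under this normalization the three spellings of "debt consolidation" collapse into one key, so the ten listed values reduce to eight.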
dti
Real number (ℝ≥0)
| Distinct | 4262 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.37951365 |
| Minimum | 0 |
|---|---|
| Maximum | 9999 |
| Zeros | 313 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4.68 |
| Q1 | 11.28 |
| median | 16.91 |
| Q3 | 22.98 |
| 95-th percentile | 31.58 |
| Maximum | 9999 |
| Range | 9999 |
| Interquartile range (IQR) | 11.7 |
Descriptive statistics
| Standard deviation | 18.01909234 |
|---|---|
| Coefficient of variation (CV) | 1.036800725 |
| Kurtosis | 237923.6765 |
| Mean | 17.37951365 |
| Median Absolute Deviation (MAD) | 5.83 |
| Skewness | 431.0512254 |
| Sum | 6882808.79 |
| Variance | 324.6876889 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 313 | 0.1% |
| 14.4 | 310 | 0.1% |
| 19.2 | 302 | 0.1% |
| 16.8 | 301 | 0.1% |
| 18 | 300 | 0.1% |
| 20.4 | 296 | 0.1% |
| 12 | 293 | 0.1% |
| 13.2 | 291 | 0.1% |
| 21.6 | 270 | 0.1% |
| 15.6 | 266 | 0.1% |
| Other values (4252) | 393088 |
| Value | Count | Frequency (%) |
| 0 | 313 | |
| 0.01 | 8 | < 0.1% |
| 0.02 | 12 | < 0.1% |
| 0.03 | 5 | < 0.1% |
| 0.04 | 5 | < 0.1% |
| 0.05 | 6 | < 0.1% |
| 0.06 | 7 | < 0.1% |
| 0.07 | 7 | < 0.1% |
| 0.08 | 8 | < 0.1% |
| 0.09 | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 9999 | 1 | |
| 1622 | 1 | |
| 380.53 | 1 | |
| 189.9 | 1 | |
| 145.65 | 1 | |
| 138.03 | 1 | |
| 120.66 | 1 | |
| 107.55 | 1 | |
| 93.86 | 1 | |
| 92.13 | 1 |
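The descriptive statistics above (skewness 431.05, maximum 9999 against a median of 16.91) are related by simple formulas, and the extreme values can be checked against a conventional outlier threshold. A short sketch that recomputes the coefficient of variation and the IQR from the reported figures and applies the common Q3 + 1.5 × IQR fence; the fence is a general rule of thumb assumed here for illustration, not a threshold used by the report:

```python
# Figures copied from the quantile and descriptive statistics tables.
mean, std, variance = 17.37951365, 18.01909234, 324.6876889
q1, median, q3 = 11.28, 16.91, 22.98

cv = std / mean              # coefficient of variation
iqr = q3 - q1                # interquartile range
upper_fence = q3 + 1.5 * iqr  # assumed rule-of-thumb outlier threshold

# The ten largest values listed in the last table.
largest = [9999, 1622, 380.53, 189.9, 145.65,
           138.03, 120.66, 107.55, 93.86, 92.13]
flagged = [v for v in largest if v > upper_fence]

print(f"CV = {cv:.6f}, IQR = {iqr:.2f}, upper fence = {upper_fence:.2f}")
print(f"{len(flagged)} of {len(largest)} listed maxima exceed the fence")
```

All ten listed maxima fall above this fence, while the 95th percentile (31.58) does not, which suggests the skew comes from a small number of extreme records.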
earliest_cr_line
Categorical
| Distinct | 684 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| Oct-2000 | 3017 |
|---|---|
| Aug-2000 | 2935 |
| Oct-2001 | 2896 |
| Aug-2001 | 2884 |
| Nov-2000 | 2736 |
| Other values (679) | 381562 |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Characters and Unicode
| Total characters | 3168240 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 45 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Jun-1990 |
|---|---|
| 2nd row | Jul-2004 |
| 3rd row | Aug-2007 |
| 4th row | Sep-2006 |
| 5th row | Mar-1999 |
Common Values
| Value | Count | Frequency (%) |
| Oct-2000 | 3017 | 0.8% |
| Aug-2000 | 2935 | 0.7% |
| Oct-2001 | 2896 | 0.7% |
| Aug-2001 | 2884 | 0.7% |
| Nov-2000 | 2736 | 0.7% |
| Oct-1999 | 2726 | 0.7% |
| Nov-1999 | 2700 | 0.7% |
| Sep-2000 | 2691 | 0.7% |
| Oct-2002 | 2640 | 0.7% |
| Aug-2002 | 2599 | 0.7% |
| Other values (674) | 368206 |
Length (plot not captured)
Words
| Value | Count | Frequency (%) |
| oct-2000 | 3017 | 0.8% |
| aug-2000 | 2935 | 0.7% |
| oct-2001 | 2896 | 0.7% |
| aug-2001 | 2884 | 0.7% |
| nov-2000 | 2736 | 0.7% |
| oct-1999 | 2726 | 0.7% |
| nov-1999 | 2700 | 0.7% |
| sep-2000 | 2691 | 0.7% |
| oct-2002 | 2640 | 0.7% |
| aug-2002 | 2599 | 0.7% |
| Other values (674) | 368206 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 416557 | |
| 9 | 402384 | 12.7% |
| - | 396030 | 12.5% |
| 1 | 253612 | 8.0% |
| 2 | 228260 | 7.2% |
| e | 100403 | 3.2% |
| u | 99766 | 3.1% |
| J | 93111 | 2.9% |
| a | 92756 | 2.9% |
| 8 | 78297 | 2.5% |
| Other values (23) | 1007064 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1584120 | |
| Lowercase Letter | 792060 | |
| Dash Punctuation | 396030 | 12.5% |
| Uppercase Letter | 396030 | 12.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 100403 | |
| u | 99766 | |
| a | 92756 | |
| c | 71978 | |
| p | 66904 | |
| n | 61139 | |
| r | 60848 | |
| t | 38291 | 4.8% |
| g | 37349 | 4.7% |
| v | 35583 | 4.5% |
| Other values (4) | 127043 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 416557 | |
| 9 | 402384 | |
| 1 | 253612 | |
| 2 | 228260 | |
| 8 | 78297 | 4.9% |
| 7 | 44922 | 2.8% |
| 4 | 40809 | 2.6% |
| 6 | 40321 | 2.5% |
| 3 | 39568 | 2.5% |
| 5 | 39390 | 2.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 93111 | |
| A | 66580 | |
| M | 62062 | |
| O | 38291 | |
| S | 37673 | |
| N | 35583 | 9.0% |
| D | 33687 | 8.5% |
| F | 29043 | 7.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 396030 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1980150 | |
| Latin | 1188090 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 100403 | 8.5% |
| u | 99766 | 8.4% |
| J | 93111 | 7.8% |
| a | 92756 | 7.8% |
| c | 71978 | 6.1% |
| p | 66904 | 5.6% |
| A | 66580 | 5.6% |
| M | 62062 | 5.2% |
| n | 61139 | 5.1% |
| r | 60848 | 5.1% |
| Other values (12) | 412543 |
Common
| Value | Count | Frequency (%) |
| 0 | 416557 | |
| 9 | 402384 | |
| - | 396030 | |
| 1 | 253612 | |
| 2 | 228260 | |
| 8 | 78297 | 4.0% |
| 7 | 44922 | 2.3% |
| 4 | 40809 | 2.1% |
| 6 | 40321 | 2.0% |
| 3 | 39568 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3168240 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 416557 | |
| 9 | 402384 | 12.7% |
| - | 396030 | 12.5% |
| 1 | 253612 | 8.0% |
| 2 | 228260 | 7.2% |
| e | 100403 | 3.2% |
| u | 99766 | 3.1% |
| J | 93111 | 2.9% |
| a | 92756 | 2.9% |
| 8 | 78297 | 2.5% |
| Other values (23) | 1007064 |
open_acc
Real number (ℝ≥0)
| Distinct | 61 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.3111532 |
| Minimum | 0 |
|---|---|
| Maximum | 90 |
| Zeros | 6 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 8 |
| median | 10 |
| Q3 | 14 |
| 95-th percentile | 21 |
| Maximum | 90 |
| Range | 90 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 5.137648808 |
|---|---|
| Coefficient of variation (CV) | 0.4542108766 |
| Kurtosis | 2.966944774 |
| Mean | 11.3111532 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.213018844 |
| Sum | 4479556 |
| Variance | 26.39543527 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9 | 36779 | 9.3% |
| 10 | 35441 | 8.9% |
| 8 | 35137 | 8.9% |
| 11 | 32695 | 8.3% |
| 7 | 31328 | 7.9% |
| 12 | 29157 | 7.4% |
| 6 | 25927 | 6.5% |
| 13 | 24983 | 6.3% |
| 14 | 21173 | 5.3% |
| 5 | 18308 | 4.6% |
| Other values (51) | 105102 |
| Value | Count | Frequency (%) |
| 0 | 6 | < 0.1% |
| 1 | 85 | < 0.1% |
| 2 | 1459 | 0.4% |
| 3 | 4783 | 1.2% |
| 4 | 10709 | 2.7% |
| 5 | 18308 | |
| 6 | 25927 | |
| 7 | 31328 | |
| 8 | 35137 | |
| 9 | 36779 |
| Value | Count | Frequency (%) |
| 90 | 1 | < 0.1% |
| 76 | 2 | < 0.1% |
| 58 | 1 | < 0.1% |
| 57 | 1 | < 0.1% |
| 56 | 2 | < 0.1% |
| 55 | 2 | < 0.1% |
| 54 | 3 | |
| 53 | 6 | |
| 52 | 3 | |
| 51 | 4 |
pub_rec
Real number (ℝ≥0)
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1781910461 |
| Minimum | 0 |
|---|---|
| Maximum | 86 |
| Zeros | 338272 |
| Zeros (%) | 85.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 86 |
| Range | 86 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5306706005 |
|---|---|
| Coefficient of variation (CV) | 2.978099136 |
| Kurtosis | 1867.466643 |
| Mean | 0.1781910461 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 16.5765642 |
| Sum | 70569 |
| Variance | 0.2816112862 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 338272 | 85.4% |
| 1 | 49739 | 12.6% |
| 2 | 5476 | 1.4% |
| 3 | 1521 | 0.4% |
| 4 | 527 | 0.1% |
| 5 | 237 | 0.1% |
| 6 | 122 | < 0.1% |
| 7 | 56 | < 0.1% |
| 8 | 34 | < 0.1% |
| 9 | 12 | < 0.1% |
| Other values (10) | 34 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 338272 | 85.4% |
| 1 | 49739 | 12.6% |
| 2 | 5476 | 1.4% |
| 3 | 1521 | 0.4% |
| 4 | 527 | 0.1% |
| 5 | 237 | 0.1% |
| 6 | 122 | < 0.1% |
| 7 | 56 | < 0.1% |
| 8 | 34 | < 0.1% |
| 9 | 12 | < 0.1% |
| Value | Count | Frequency (%) |
| 86 | 1 | < 0.1% |
| 40 | 1 | < 0.1% |
| 24 | 1 | < 0.1% |
| 19 | 2 | < 0.1% |
| 17 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 13 | 4 | < 0.1% |
| 12 | 4 | < 0.1% |
| 11 | 8 | |
| 10 | 11 |
revol_bal
Real number (ℝ≥0)
| Distinct | 55622 |
|---|---|
| Distinct (%) | 14.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15844.53985 |
| Minimum | 0 |
|---|---|
| Maximum | 1743266 |
| Zeros | 2128 |
| Zeros (%) | 0.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1685 |
| Q1 | 6025 |
| median | 11181 |
| Q3 | 19620 |
| 95-th percentile | 41066.55 |
| Maximum | 1743266 |
| Range | 1743266 |
| Interquartile range (IQR) | 13595 |
Descriptive statistics
| Standard deviation | 20591.83611 |
|---|---|
| Coefficient of variation (CV) | 1.299617174 |
| Kurtosis | 384.2210931 |
| Mean | 15844.53985 |
| Median Absolute Deviation (MAD) | 6112 |
| Skewness | 11.72751512 |
| Sum | 6274913118 |
| Variance | 424023714.3 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2128 | 0.5% |
| 5655 | 41 | < 0.1% |
| 6095 | 38 | < 0.1% |
| 7792 | 38 | < 0.1% |
| 3953 | 37 | < 0.1% |
| 5098 | 36 | < 0.1% |
| 6077 | 36 | < 0.1% |
| 8502 | 35 | < 0.1% |
| 5235 | 35 | < 0.1% |
| 5389 | 35 | < 0.1% |
| Other values (55612) | 393571 |
| Value | Count | Frequency (%) |
| 0 | 2128 | |
| 1 | 30 | < 0.1% |
| 2 | 26 | < 0.1% |
| 3 | 28 | < 0.1% |
| 4 | 20 | < 0.1% |
| 5 | 23 | < 0.1% |
| 6 | 30 | < 0.1% |
| 7 | 21 | < 0.1% |
| 8 | 21 | < 0.1% |
| 9 | 23 | < 0.1% |
| Value | Count | Frequency (%) |
| 1743266 | 1 | |
| 1298783 | 1 | |
| 1190046 | 1 | |
| 1030826 | 1 | |
| 1023940 | 1 | |
| 975800 | 1 | |
| 867528 | 1 | |
| 838698 | 1 | |
| 814300 | 1 | |
| 778614 | 1 |
revol_util
Real number (ℝ≥0)
| Distinct | 1226 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 276 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.79174864 |
| Minimum | 0 |
|---|---|
| Maximum | 892.3 |
| Zeros | 2213 |
| Zeros (%) | 0.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 11.2 |
| Q1 | 35.8 |
| median | 54.8 |
| Q3 | 72.9 |
| 95-th percentile | 92 |
| Maximum | 892.3 |
| Range | 892.3 |
| Interquartile range (IQR) | 37.1 |
Descriptive statistics
| Standard deviation | 24.45219306 |
|---|---|
| Coefficient of variation (CV) | 0.4545714479 |
| Kurtosis | 2.71227821 |
| Mean | 53.79174864 |
| Median Absolute Deviation (MAD) | 18.5 |
| Skewness | -0.07177802033 |
| Sum | 21288299.69 |
| Variance | 597.9097456 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2213 | 0.6% |
| 53 | 752 | 0.2% |
| 60 | 739 | 0.2% |
| 61 | 734 | 0.2% |
| 55 | 730 | 0.2% |
| 54 | 725 | 0.2% |
| 62 | 721 | 0.2% |
| 47 | 720 | 0.2% |
| 57 | 719 | 0.2% |
| 58 | 717 | 0.2% |
| Other values (1216) | 386984 |
| Value | Count | Frequency (%) |
| 0 | 2213 | |
| 0.01 | 1 | < 0.1% |
| 0.04 | 1 | < 0.1% |
| 0.05 | 1 | < 0.1% |
| 0.1 | 253 | 0.1% |
| 0.16 | 1 | < 0.1% |
| 0.2 | 211 | 0.1% |
| 0.3 | 187 | < 0.1% |
| 0.4 | 189 | < 0.1% |
| 0.46 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 892.3 | 1 | |
| 153 | 1 | |
| 152.5 | 1 | |
| 150.7 | 1 | |
| 148 | 1 | |
| 146.1 | 1 | |
| 145.8 | 1 | |
| 140.4 | 1 | |
| 136.7 | 1 | |
| 132.1 | 1 |
total_acc
Real number (ℝ≥0)
| Distinct | 118 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.41474383 |
| Minimum | 2 |
|---|---|
| Maximum | 151 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 17 |
| median | 24 |
| Q3 | 32 |
| 95-th percentile | 47 |
| Maximum | 151 |
| Range | 149 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 11.88699072 |
|---|---|
| Coefficient of variation (CV) | 0.4677202651 |
| Kurtosis | 1.204620014 |
| Mean | 25.41474383 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.8643276369 |
| Sum | 10065001 |
| Variance | 141.3005485 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21 | 14280 | 3.6% |
| 22 | 14260 | 3.6% |
| 20 | 14228 | 3.6% |
| 23 | 13923 | 3.5% |
| 24 | 13878 | 3.5% |
| 19 | 13876 | 3.5% |
| 18 | 13710 | 3.5% |
| 17 | 13495 | 3.4% |
| 25 | 13225 | 3.3% |
| 26 | 12799 | 3.2% |
| Other values (108) | 258356 |
| Value | Count | Frequency (%) |
| 2 | 18 | < 0.1% |
| 3 | 327 | 0.1% |
| 4 | 1238 | 0.3% |
| 5 | 2028 | 0.5% |
| 6 | 2923 | 0.7% |
| 7 | 4143 | |
| 8 | 5365 | |
| 9 | 6362 | |
| 10 | 7672 | |
| 11 | 8844 |
| Value | Count | Frequency (%) |
| 151 | 1 | |
| 150 | 1 | |
| 135 | 1 | |
| 129 | 1 | |
| 124 | 1 | |
| 118 | 1 | |
| 117 | 1 | |
| 116 | 2 | |
| 115 | 1 | |
| 111 | 2 |
initial_list_status
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| f | 238066 |
|---|---|
| w | 157964 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 396030 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | w |
|---|---|
| 2nd row | f |
| 3rd row | f |
| 4th row | f |
| 5th row | f |
Common Values
| Value | Count | Frequency (%) |
| f | 238066 | 60.1% |
| w | 157964 | 39.9% |
Length (plot not captured)
Category Frequency Plot (plot not captured)
Words
| Value | Count | Frequency (%) |
| f | 238066 | 60.1% |
| w | 157964 | 39.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| f | 238066 | |
| w | 157964 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 396030 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| f | 238066 | |
| w | 157964 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 396030 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| f | 238066 | |
| w | 157964 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 396030 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| f | 238066 | |
| w | 157964 |
application_type
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| INDIVIDUAL | 395319 |
|---|---|
| JOINT | 425 |
| DIRECT_PAY | 286 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.994634245 |
| Min length | 5 |
Characters and Unicode
| Total characters | 3958175 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | INDIVIDUAL |
|---|---|
| 2nd row | INDIVIDUAL |
| 3rd row | INDIVIDUAL |
| 4th row | INDIVIDUAL |
| 5th row | INDIVIDUAL |
Common Values
| Value | Count | Frequency (%) |
| INDIVIDUAL | 395319 | 99.8% |
| JOINT | 425 | 0.1% |
| DIRECT_PAY | 286 | 0.1% |
Length (plot not captured)
Category Frequency Plot (plot not captured)
Words
| Value | Count | Frequency (%) |
| individual | 395319 | 99.8% |
| joint | 425 | 0.1% |
| direct_pay | 286 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 1186668 | |
| D | 790924 | |
| N | 395744 | 10.0% |
| A | 395605 | 10.0% |
| V | 395319 | 10.0% |
| U | 395319 | 10.0% |
| L | 395319 | 10.0% |
| T | 711 | < 0.1% |
| J | 425 | < 0.1% |
| O | 425 | < 0.1% |
| Other values (6) | 1716 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3957889 | |
| Connector Punctuation | 286 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 1186668 | |
| D | 790924 | |
| N | 395744 | 10.0% |
| A | 395605 | 10.0% |
| V | 395319 | 10.0% |
| U | 395319 | 10.0% |
| L | 395319 | 10.0% |
| T | 711 | < 0.1% |
| J | 425 | < 0.1% |
| O | 425 | < 0.1% |
| Other values (5) | 1430 | < 0.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 286 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3957889 | |
| Common | 286 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| I | 1186668 | |
| D | 790924 | |
| N | 395744 | 10.0% |
| A | 395605 | 10.0% |
| V | 395319 | 10.0% |
| U | 395319 | 10.0% |
| L | 395319 | 10.0% |
| T | 711 | < 0.1% |
| J | 425 | < 0.1% |
| O | 425 | < 0.1% |
| Other values (5) | 1430 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| _ | 286 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3958175 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| I | 1186668 | |
| D | 790924 | |
| N | 395744 | 10.0% |
| A | 395605 | 10.0% |
| V | 395319 | 10.0% |
| U | 395319 | 10.0% |
| L | 395319 | 10.0% |
| T | 711 | < 0.1% |
| J | 425 | < 0.1% |
| O | 425 | < 0.1% |
| Other values (6) | 1716 | < 0.1% |
mort_acc
Real number (ℝ≥0)
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 37795 |
| Missing (%) | 9.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.813990816 |
| Minimum | 0 |
|---|---|
| Maximum | 34 |
| Zeros | 139777 |
| Zeros (%) | 35.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 6 |
| Maximum | 34 |
| Range | 34 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.147930467 |
|---|---|
| Coefficient of variation (CV) | 1.184091148 |
| Kurtosis | 4.477175726 |
| Mean | 1.813990816 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.600132438 |
| Sum | 649835 |
| Variance | 4.613605292 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 139777 | 35.3% |
| 1 | 60416 | 15.3% |
| 2 | 49948 | 12.6% |
| 3 | 38049 | 9.6% |
| 4 | 27887 | 7.0% |
| 5 | 18194 | 4.6% |
| 6 | 11069 | 2.8% |
| 7 | 6052 | 1.5% |
| 8 | 3121 | 0.8% |
| 9 | 1656 | 0.4% |
| Other values (23) | 2066 | 0.5% |
| (Missing) | 37795 | 9.5% |
| Value | Count | Frequency (%) |
| 0 | 139777 | 35.3% |
| 1 | 60416 | 15.3% |
| 2 | 49948 | 12.6% |
| 3 | 38049 | 9.6% |
| 4 | 27887 | 7.0% |
| 5 | 18194 | 4.6% |
| 6 | 11069 | 2.8% |
| 7 | 6052 | 1.5% |
| 8 | 3121 | 0.8% |
| 9 | 1656 | 0.4% |
| Value | Count | Frequency (%) |
| 34 | 1 | < 0.1% |
| 32 | 2 | < 0.1% |
| 31 | 2 | < 0.1% |
| 30 | 1 | < 0.1% |
| 28 | 1 | < 0.1% |
| 27 | 3 | < 0.1% |
| 26 | 2 | < 0.1% |
| 25 | 4 | < 0.1% |
| 24 | 10 | |
| 23 | 2 | < 0.1% |
pub_rec_bankruptcies
Real number (ℝ≥0)
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 535 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1216475556 |
| Minimum | 0 |
|---|---|
| Maximum | 8 |
| Zeros | 350380 |
| Zeros (%) | 88.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3561742766 |
|---|---|
| Coefficient of variation (CV) | 2.927919718 |
| Kurtosis | 18.10416044 |
| Mean | 0.1216475556 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.423440368 |
| Sum | 48111 |
| Variance | 0.1268601153 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 350380 | 88.5% |
| 1 | 42790 | 10.8% |
| 2 | 1847 | 0.5% |
| 3 | 351 | 0.1% |
| 4 | 82 | < 0.1% |
| 5 | 32 | < 0.1% |
| 6 | 7 | < 0.1% |
| 7 | 4 | < 0.1% |
| 8 | 2 | < 0.1% |
| (Missing) | 535 | 0.1% |
| Value | Count | Frequency (%) |
| 8 | 2 | < 0.1% |
| 7 | 4 | < 0.1% |
| 6 | 7 | < 0.1% |
| 5 | 32 | < 0.1% |
| 4 | 82 | < 0.1% |
| 3 | 351 | 0.1% |
| 2 | 1847 | 0.5% |
| 1 | 42790 | 10.8% |
| 0 | 350380 | 88.5% |
| Distinct | 393700 |
|---|---|
| Distinct (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| USCGC Smith FPO AE 70466 | 8 |
|---|---|
| USS Johnson FPO AE 48052 | 8 |
| USNS Johnson FPO AE 05113 | 8 |
| USS Smith FPO AP 70466 | 8 |
| USNS Johnson FPO AP 48052 | 7 |
| Other values (393695) | 395991 |
Length
| Max length | 69 |
|---|---|
| Median length | 60 |
| Mean length | 44.71395096 |
| Min length | 20 |
Characters and Unicode
| Total characters | 17708066 |
|---|---|
| Distinct characters | 67 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 391984 |
|---|---|
| Unique (%) | 99.0% |
Sample
| 1st row | 0174 Michelle Gateway Mendozaberg, OK 22690 |
|---|---|
| 2nd row | 1076 Carney Fort Apt. 347 Loganmouth, SD 05113 |
| 3rd row | 87025 Mark Dale Apt. 269 New Sabrina, WV 05113 |
| 4th row | 823 Reid Ford Delacruzside, MA 00813 |
| 5th row | 679 Luna Roads Greggshire, VA 11650 |
Common Values
| Value | Count | Frequency (%) |
| USCGC Smith FPO AE 70466 | 8 | < 0.1% |
| USS Johnson FPO AE 48052 | 8 | < 0.1% |
| USNS Johnson FPO AE 05113 | 8 | < 0.1% |
| USS Smith FPO AP 70466 | 8 | < 0.1% |
| USNS Johnson FPO AP 48052 | 7 | < 0.1% |
| USNV Smith FPO AA 00813 | 6 | < 0.1% |
| USCGC Smith FPO AA 70466 | 6 | < 0.1% |
| USCGC Jones FPO AE 22690 | 6 | < 0.1% |
| USNS Johnson FPO AA 70466 | 6 | < 0.1% |
| USNV Smith FPO AE 30723 | 6 | < 0.1% |
| Other values (393690) | 395961 | > 99.9% |
Length
| Value | Count | Frequency (%) |
| suite | 88417 | 3.0% |
| apt | 88400 | 3.0% |
| 70466 | 56986 | 2.0% |
| 30723 | 56548 | 1.9% |
| 22690 | 56527 | 1.9% |
| 48052 | 55920 | 1.9% |
| 00813 | 45826 | 1.6% |
| 29597 | 45472 | 1.6% |
| 05113 | 45403 | 1.6% |
| box | 28349 | 1.0% |
| Other values (108604) | 2352838 | 80.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2128626 | 12.0% |
| e | 911545 | 5.1% |
| a | 735427 | 4.2% |
| t | 702787 | 4.0% |
| r | 656748 | 3.7% |
| 0 | 624825 | 3.5% |
| i | 580043 | 3.3% |
| o | 579480 | 3.3% |
| n | 551350 | 3.1% |
| 2 | 487525 | 2.8% |
| Other values (57) | 9749710 | 55.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7690715 | 43.4% |
| Decimal Number | 4151920 | 23.4% |
| Uppercase Letter | 2488639 | 14.1% |
| Space Separator | 2128626 | 12.0% |
| Control | 792060 | 4.5% |
| Other Punctuation | 456106 | 2.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 911545 | 11.9% |
| a | 735427 | 9.6% |
| t | 702787 | 9.1% |
| r | 656748 | 8.5% |
| i | 580043 | 7.5% |
| o | 579480 | 7.5% |
| n | 551350 | 7.2% |
| s | 471608 | 6.1% |
| l | 400273 | 5.2% |
| h | 341828 | 4.4% |
| Other values (16) | 1759626 | 22.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 295950 | 11.9% |
| S | 274289 | 11.0% |
| P | 164644 | 6.6% |
| M | 161767 | 6.5% |
| C | 157893 | 6.3% |
| N | 148622 | 6.0% |
| D | 106785 | 4.3% |
| L | 105374 | 4.2% |
| W | 94264 | 3.8% |
| R | 93408 | 3.8% |
| Other values (16) | 885643 | 35.6% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 624825 | 15.0% |
| 2 | 487525 | 11.7% |
| 3 | 443992 | 10.7% |
| 6 | 421262 | 10.1% |
| 7 | 387522 | 9.3% |
| 1 | 375962 | 9.1% |
| 9 | 375452 | 9.0% |
| 5 | 375301 | 9.0% |
| 4 | 330279 | 8.0% |
| 8 | 329800 | 7.9% |
Control
| Value | Count | Frequency (%) |
| \r | 396030 | 50.0% |
| \n | 396030 | 50.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 367706 | 80.6% |
| . | 88400 | 19.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2128626 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 10179354 | 57.5% |
| Common | 7528712 | 42.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 911545 | 9.0% |
| a | 735427 | 7.2% |
| t | 702787 | 6.9% |
| r | 656748 | 6.5% |
| i | 580043 | 5.7% |
| o | 579480 | 5.7% |
| n | 551350 | 5.4% |
| s | 471608 | 4.6% |
| l | 400273 | 3.9% |
| h | 341828 | 3.4% |
| Other values (42) | 4248265 | 41.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 2128626 | 28.3% |
| 0 | 624825 | 8.3% |
| 2 | 487525 | 6.5% |
| 3 | 443992 | 5.9% |
| 6 | 421262 | 5.6% |
| \r | 396030 | 5.3% |
| \n | 396030 | 5.3% |
| 7 | 387522 | 5.1% |
| 1 | 375962 | 5.0% |
| 9 | 375452 | 5.0% |
| Other values (5) | 1491486 | 19.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 17708066 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2128626 | 12.0% |
| e | 911545 | 5.1% |
| a | 735427 | 4.2% |
| t | 702787 | 4.0% |
| r | 656748 | 3.7% |
| 0 | 624825 | 3.5% |
| i | 580043 | 3.3% |
| o | 579480 | 3.3% |
| n | 551350 | 3.1% |
| 2 | 487525 | 2.8% |
| Other values (57) | 9749710 | 55.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
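The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, and giving tied values the mean of their positions is an assumption about tie handling.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions (assumed tie rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (the 1/n factors cancel)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: the ranks agree perfectly,
# while the raw values do not lie on a straight line.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

In this toy input the ranks of `x` and `y` are identical, so ρ reaches its maximum while Pearson's r on the raw values stays below 1.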
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the slope (the angle to the x-axis) does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
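As a minimal sketch of that formula, the snippet below computes r for the `loan_amnt` and `installment` values of the first five rows shown in this report. The function name `pearson_r` is hypothetical, and population (divide by n) moments are an assumption; the choice cancels in the ratio.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# loan_amnt and installment for rows 0-4 of the "First rows" table.
loan_amnt = [10000.0, 8000.0, 15600.0, 7200.0, 24375.0]
installment = [329.48, 265.68, 506.97, 220.65, 609.33]
r = pearson_r(loan_amnt, installment)
print(round(r, 3))

# Invariance under separate changes of location and (positive) scale:
# rescaling either variable leaves r unchanged up to floating-point error.
shifted = pearson_r([a / 1000 + 3 for a in loan_amnt], [12 * b - 1 for b in installment])
print(abs(r - shifted) < 1e-9)
```

Five rows are far too few to estimate the correlation reported for the full dataset; the values only serve to make the arithmetic concrete.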
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
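The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; the function name `kendall_tau_a` is hypothetical, and statistical libraries often default to a tie-corrected variant (tau-b), so values may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with tied pairs counted as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations give three pairs: two concordant, one discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The quadratic loop over all pairs keeps the definition visible; it is not meant for a dataset of this report's size.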
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | loan_amnt | term | int_rate | installment | grade | sub_grade | emp_title | emp_length | home_ownership | annual_inc | verification_status | issue_d | loan_status | purpose | title | dti | earliest_cr_line | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | application_type | mort_acc | pub_rec_bankruptcies | address |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10000.0 | 36 months | 11.44 | 329.48 | B | B4 | Marketing | 10+ years | RENT | 117000.0 | Not Verified | Jan-2015 | Fully Paid | vacation | Vacation | 26.24 | Jun-1990 | 16.0 | 0.0 | 36369.0 | 41.8 | 25.0 | w | INDIVIDUAL | 0.0 | 0.0 | 0174 Michelle Gateway\r\nMendozaberg, OK 22690 |
| 1 | 8000.0 | 36 months | 11.99 | 265.68 | B | B5 | Credit analyst | 4 years | MORTGAGE | 65000.0 | Not Verified | Jan-2015 | Fully Paid | debt_consolidation | Debt consolidation | 22.05 | Jul-2004 | 17.0 | 0.0 | 20131.0 | 53.3 | 27.0 | f | INDIVIDUAL | 3.0 | 0.0 | 1076 Carney Fort Apt. 347\r\nLoganmouth, SD 05113 |
| 2 | 15600.0 | 36 months | 10.49 | 506.97 | B | B3 | Statistician | < 1 year | RENT | 43057.0 | Source Verified | Jan-2015 | Fully Paid | credit_card | Credit card refinancing | 12.79 | Aug-2007 | 13.0 | 0.0 | 11987.0 | 92.2 | 26.0 | f | INDIVIDUAL | 0.0 | 0.0 | 87025 Mark Dale Apt. 269\r\nNew Sabrina, WV 05113 |
| 3 | 7200.0 | 36 months | 6.49 | 220.65 | A | A2 | Client Advocate | 6 years | RENT | 54000.0 | Not Verified | Nov-2014 | Fully Paid | credit_card | Credit card refinancing | 2.60 | Sep-2006 | 6.0 | 0.0 | 5472.0 | 21.5 | 13.0 | f | INDIVIDUAL | 0.0 | 0.0 | 823 Reid Ford\r\nDelacruzside, MA 00813 |
| 4 | 24375.0 | 60 months | 17.27 | 609.33 | C | C5 | Destiny Management Inc. | 9 years | MORTGAGE | 55000.0 | Verified | Apr-2013 | Charged Off | credit_card | Credit Card Refinance | 33.95 | Mar-1999 | 13.0 | 0.0 | 24584.0 | 69.8 | 43.0 | f | INDIVIDUAL | 1.0 | 0.0 | 679 Luna Roads\r\nGreggshire, VA 11650 |
| 5 | 20000.0 | 36 months | 13.33 | 677.07 | C | C3 | HR Specialist | 10+ years | MORTGAGE | 86788.0 | Verified | Sep-2015 | Fully Paid | debt_consolidation | Debt consolidation | 16.31 | Jan-2005 | 8.0 | 0.0 | 25757.0 | 100.6 | 23.0 | f | INDIVIDUAL | 4.0 | 0.0 | 1726 Cooper Passage Suite 129\r\nNorth Deniseberg, DE 30723 |
| 6 | 18000.0 | 36 months | 5.32 | 542.07 | A | A1 | Software Development Engineer | 2 years | MORTGAGE | 125000.0 | Source Verified | Sep-2015 | Fully Paid | home_improvement | Home improvement | 1.36 | Aug-2005 | 8.0 | 0.0 | 4178.0 | 4.9 | 25.0 | f | INDIVIDUAL | 3.0 | 0.0 | 1008 Erika Vista Suite 748\r\nEast Stephanie, TX 22690 |
| 7 | 13000.0 | 36 months | 11.14 | 426.47 | B | B2 | Office Depot | 10+ years | RENT | 46000.0 | Not Verified | Sep-2012 | Fully Paid | credit_card | No More Credit Cards | 26.87 | Sep-1994 | 11.0 | 0.0 | 13425.0 | 64.5 | 15.0 | f | INDIVIDUAL | 0.0 | 0.0 | USCGC Nunez\r\nFPO AE 30723 |
| 8 | 18900.0 | 60 months | 10.99 | 410.84 | B | B3 | Application Architect | 10+ years | RENT | 103000.0 | Verified | Oct-2014 | Fully Paid | debt_consolidation | Debt consolidation | 12.52 | Jun-1994 | 13.0 | 0.0 | 18637.0 | 32.9 | 40.0 | w | INDIVIDUAL | 3.0 | 0.0 | USCGC Tran\r\nFPO AP 22690 |
| 9 | 26300.0 | 36 months | 16.29 | 928.40 | C | C5 | Regado Biosciences | 3 years | MORTGAGE | 115000.0 | Verified | Apr-2012 | Fully Paid | debt_consolidation | Debt Consolidation | 23.69 | Dec-1997 | 13.0 | 0.0 | 22171.0 | 82.4 | 37.0 | f | INDIVIDUAL | 1.0 | 0.0 | 3390 Luis Rue\r\nMauricestad, VA 00813 |
Last rows
| | loan_amnt | term | int_rate | installment | grade | sub_grade | emp_title | emp_length | home_ownership | annual_inc | verification_status | issue_d | loan_status | purpose | title | dti | earliest_cr_line | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | application_type | mort_acc | pub_rec_bankruptcies | address |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 396020 | 10000.0 | 36 months | 9.76 | 321.55 | B | B3 | Retirement Counselor | 10+ years | RENT | 40000.0 | Not Verified | Dec-2015 | Fully Paid | debt_consolidation | Debt consolidation | 23.40 | Jan-1988 | 9.0 | 0.0 | 8819.0 | 57.3 | 18.0 | w | INDIVIDUAL | 1.0 | 0.0 | 914 Alexander Mountains Apt. 604\r\nEast Marco, VT 70466 |
| 396021 | 3200.0 | 36 months | 5.42 | 96.52 | A | A1 | St Francis Medical Center | 10+ years | RENT | 33000.0 | Not Verified | Feb-2011 | Fully Paid | debt_consolidation | 2011 Insurance and Debt Consolidation | 21.45 | Nov-1996 | 18.0 | 0.0 | 3985.0 | 7.6 | 50.0 | f | INDIVIDUAL | NaN | 0.0 | 309 John Mission\r\nWest Marc, NY 00813 |
| 396022 | 12000.0 | 36 months | 12.29 | 400.24 | C | C1 | Data Center Specialist II | 1 year | RENT | 52100.0 | Source Verified | Oct-2015 | Fully Paid | debt_consolidation | Debt consolidation | 17.28 | Oct-2004 | 6.0 | 0.0 | 9580.0 | 66.1 | 18.0 | w | INDIVIDUAL | 0.0 | 0.0 | 532 Johnson Drive Apt. 185\r\nAndersonside, NY 70466 |
| 396023 | 22000.0 | 36 months | 18.92 | 805.55 | D | D4 | Operations Manager | 10+ years | MORTGAGE | 138000.0 | Not Verified | Apr-2014 | Fully Paid | debt_consolidation | Debt consolidation | 24.43 | May-1998 | 18.0 | 0.0 | 22287.0 | 50.4 | 39.0 | f | INDIVIDUAL | 4.0 | 0.0 | 0297 Flores Dale Suite 441\r\nTaylorland, MD 05113 |
| 396024 | 6000.0 | 36 months | 13.11 | 202.49 | B | B4 | Michael's Arts & Crafts | 5 years | RENT | 64000.0 | Not Verified | Mar-2013 | Fully Paid | debt_consolidation | Credit buster | 10.81 | Nov-1991 | 7.0 | 0.0 | 11456.0 | 97.1 | 9.0 | w | INDIVIDUAL | 0.0 | 0.0 | 514 Cynthia Park Apt. 402\r\nWest Williamside, SC 05113 |
| 396025 | 10000.0 | 60 months | 10.99 | 217.38 | B | B4 | licensed bankere | 2 years | RENT | 40000.0 | Source Verified | Oct-2015 | Fully Paid | debt_consolidation | Debt consolidation | 15.63 | Nov-2004 | 6.0 | 0.0 | 1990.0 | 34.3 | 23.0 | w | INDIVIDUAL | 0.0 | 0.0 | 12951 Williams Crossing\r\nJohnnyville, DC 30723 |
| 396026 | 21000.0 | 36 months | 12.29 | 700.42 | C | C1 | Agent | 5 years | MORTGAGE | 110000.0 | Source Verified | Feb-2015 | Fully Paid | debt_consolidation | Debt consolidation | 21.45 | Feb-2006 | 6.0 | 0.0 | 43263.0 | 95.7 | 8.0 | f | INDIVIDUAL | 1.0 | 0.0 | 0114 Fowler Field Suite 028\r\nRachelborough, LA 05113 |
| 396027 | 5000.0 | 36 months | 9.99 | 161.32 | B | B1 | City Carrier | 10+ years | RENT | 56500.0 | Verified | Oct-2013 | Fully Paid | debt_consolidation | pay off credit cards | 17.56 | Mar-1997 | 15.0 | 0.0 | 32704.0 | 66.9 | 23.0 | f | INDIVIDUAL | 0.0 | 0.0 | 953 Matthew Points Suite 414\r\nReedfort, NY 70466 |
| 396028 | 21000.0 | 60 months | 15.31 | 503.02 | C | C2 | Gracon Services, Inc | 10+ years | MORTGAGE | 64000.0 | Verified | Aug-2012 | Fully Paid | debt_consolidation | Loanforpayoff | 15.88 | Nov-1990 | 9.0 | 0.0 | 15704.0 | 53.8 | 20.0 | f | INDIVIDUAL | 5.0 | 0.0 | 7843 Blake Freeway Apt. 229\r\nNew Michael, FL 29597 |
| 396029 | 2000.0 | 36 months | 13.61 | 67.98 | C | C2 | Internal Revenue Service | 10+ years | RENT | 42996.0 | Verified | Jun-2010 | Fully Paid | debt_consolidation | Toxic Debt Payoff | 8.32 | Sep-1998 | 3.0 | 0.0 | 4292.0 | 91.3 | 19.0 | f | INDIVIDUAL | NaN | 0.0 | 787 Michelle Causeway\r\nBriannaton, AR 48052 |